Pre-Training of BERT-Based Transformer Architectures Explained - Language and Vision!